ABSTRACT

Sequence analysis of a substantial part of the 
polymerase gene of the murine coronavirus MHV-A59 
revealed the 3' end of an open reading frame (ORF1a)
overlapping with a large ORF (ORF1b; 2733 amino
acids) which covers the 3' half of the polymerase gene.
The expression of ORF1b occurs by a ribosomal
frameshifting mechanism since the ORF1a/ORF1b
overlapping nucleotide sequence is capable of inducing
ribosomal frameshifting in vitro as well as in vivo. A
stem-loop structure and a pseudoknot are predicted in
the nucleotide sequence involved in ribosomal
frameshifting. Comparison of the predicted amino acid
sequence of MHV ORF1b with the amino acid sequence
deduced from the corresponding gene of the avian 
coronavirus IBV demonstrated that in contrast to the 
other viral genes this ORF is extremely conserved. 
Detailed analysis of the predicted amino acid sequence 
revealed sequence elements which are conserved in 
many DNA and RNA polymerases. 

INTRODUCTION 

The genome of mouse hepatitis virus (MHV), a coronavirus, 
consists of an infectious single stranded RNA molecule of 
approximately 30 kb in length (1). After entry the viral genome 
is released, translated into the RNA dependent RNA polymerase 
and subsequently used as the template for the transcription of 
negative stranded RNA of genome length (2, 3). This RNA then 
serves as a template for the synthesis of six subgenomic mRNAs 
and genomic RNA. The subgenomic mRNAs form a 
3'-coterminal nested set. An unusual feature of the mRNAs of 
coronaviruses is the presence of an identical leader sequence. 
This common leader is proposed to result from a unique leader- 
primed transcription mechanism. The transcription and translation 
strategy of coronaviruses has recently been reviewed in detail (4). 


The RNA dependent RNA polymerase of coronaviruses is a 
multifunctional protein: it contains the activities necessary for 
the transcription of negative stranded RNA, leader RNA, 
subgenomic mRNAs and progeny virion RNA. In addition it is 
likely to possess capping activity. These activities and the 
protein(s) on which they reside are poorly characterized. 
Complementation studies using temperature sensitive (ts) mutants 
which are defective in RNA synthesis revealed six different 
complementation groups, indicating that a large number of genes 
or at least activities are involved in the synthesis of the viral RNAs 
(5, Van der Zeijst, personal communication). Several authors 
have shown the presence of membrane associated RNA dependent 
RNA polymerase activity in lysolecithin permeabilized MHV- 
A59 infected cells (6, 7) or cytoplasmic lysates of MHV infected 
cells (8, 9). Brayton et al. (10) have described one early and 
two late polymerase activities in lysates of MHV-A59 infected 
cells. The early polymerase activity was shown to be involved 
in the synthesis of negative stranded RNA, while the late RNA 
polymerase activities were responsible for the synthesis of 
genomic RNA and subgenomic mRNAs. 

In vitro translation of genomic RNA of MHV-A59 resulted 
in the synthesis of a protein with a molecular weight exceeding 
200 kd (11). In vitro this protein is cleaved into a 220 kd and 
a 28 kd protein. The p28 protein is the N-terminal cleavage 
product of the precursor protein (11, 12) and has also been 
identified in MHV-A59 infected cells (13). 

The nucleotide sequence of the gene encoding the RNA 
polymerase (pol) of the avian coronavirus infectious bronchitis 
virus (IBV) has been determined (14). The pol gene, which is 
about 20 kb in length, contained two large open reading frames
(ORFs), ORF1a and ORF1b (previously termed F1 and F2), which
potentially encode polypeptides of 441 kd and 300 kd,
respectively. The ORFs overlap by 42 nucleotides, ORF1b being
in a -1 reading frame with respect to ORF1a. Brierley et al.
(15, 16) showed that a cDNA fragment spanning the
ORF1a/ORF1b overlap was able to direct ribosomal frameshifting
in vitro and in vivo.
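
The effect of a -1 shift on the reading frame can be illustrated with a short sketch. The sequence used here is a made-up toy example, not the MHV or IBV overlap, and the function names and the `slip_after` parameter are assumptions introduced only for illustration. The sketch translates a sequence in the upstream frame until its stop codon, and then again with the ribosome stepping back one nucleotide at a chosen position, so that the downstream frame is read as a fusion.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, start=0):
    """Translate from `start` until a stop codon or the end of the sequence."""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def translate_with_minus1_shift(seq, slip_after):
    """Read frame 0 up to index `slip_after` (a multiple of 3, exclusive),
    then step back one nucleotide and continue in the -1 frame."""
    upstream = translate(seq[:slip_after])
    downstream = translate(seq, start=slip_after - 1)
    return upstream + downstream


# Hypothetical toy sequence: frame 0 stops at TAA (index 12),
# while the -1 frame stays open until a later TAA.
toy = "ATGGCTTTAAACTAAGGCTGCTGTAA"
terminated_product = translate(toy)
frameshift_product = translate_with_minus1_shift(toy, slip_after=12)
```

In this toy case the nucleotide at index 11 is read twice, once as the last base of the final frame-0 codon and once as the first base of the first -1 frame codon, which is what a -1 slip implies.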

Recently, we have completed the molecular cloning of the 
genome of MHV-A59 and determined the nucleotide sequence 
of the p28 coding region at its 5' end (1). Here we present the 
predicted amino acid sequence of ORF1b and of the carboxyl
terminal region of ORF1a of MHV-A59. In addition, we
demonstrate that a conserved nucleotide sequence of the
ORF1a/ORF1b overlap directs ribosomal frameshifting in vitro
and in vivo. 

MATERIALS AND METHODS 

Isolation of viral genome RNA 

Virus obtained from roller bottle cultures of Sac(-) cells infected 
at 2 p.f.u./cell was purified on sucrose gradients as described 
before (17). Viral genomic RNA was isolated as described 
previously (18). 

cDNA synthesis and cloning 

First and second strand cDNA synthesis were carried out as 
described by Gubler and Hoffman (19) using viral genomic RNA 
(50 µg/ml) as a template and calf thymus pentanucleotides
(100 µg/ml) as random primers. Reverse transcriptase was
obtained from Promega, RNasin from Amersham, Escherichia
coli DNA ligase from New England Biolabs, RNase H and DNA
polymerase I from Boehringer. After phenol/chloroform
extraction and ethanol precipitation approximately 0.3 µg cDNA
was used for homopolymeric tailing using dCTP (20). Samples
were taken every 30 seconds during tailing and the reaction was 
immediately stopped by adding 0.1 volume of 1 % SDS containing 
100 mM EDTA. After phenol extraction and ethanol 
precipitation, the tailed cDNA samples were annealed to Pst I 
digested, oligo-dG tailed pUC9 (Pharmacia). For transformations 
(21) Escherichia coli strain JM109 was used. 

Subcloning for M13 sequencing

DNA restriction fragments were separated by agarose gel 
electrophoresis and isolated by binding to NA45 membranes 
(Schleicher and Schuell). Purified fragments were recloned in 
M13 vectors. When no convenient restriction enzyme sites were
available, plasmid DNA was digested with Rsa I and the DNA
fragments were directly subcloned in Sma I cut M13mp19. M13
clones were selected by hybridization to nick translated purified 
DNA fragments of the region to be sequenced and screened by 
single track sequencing. 

DNA sequencing 

Single stranded DNA from M13 clones was sequenced using the 
Klenow fragment of DNA polymerase I (22) and [32P]α-dATP
or T7 DNA polymerase (23) (Sequenase, US Biochemicals) and
[35S]α-dATP. Double stranded DNA was sequenced using T7
DNA polymerase according to the instructions of the
manufacturer. Sequence data were analyzed using the computer
programs of Staden (24) and Wisconsin (version 5, 1987) (25).
Comparison to the National Biomedical Research Foundation
(NBRF) protein identification resource was made using the
program FASTA (26).

Construction of a plasmid for the analysis of ribosomal 
frameshifting 

To study the potential ribosomal frameshifting in the pol gene
of MHV-A59 the expression vector pBBMAC was constructed
as follows. Plasmid pMAC (a gift of Peter Rottier), which
contained a copy of the MHV-A59 membrane (M) protein gene
of which the region encoding the amino acids 121 up to 196 (27)
was deleted (full details on pMAC will be published elsewhere),
was digested with BamH I and filled in using the Klenow
fragment of DNA polymerase I. The M protein encoding DNA
fragment was purified and ligated into Sma I cut Bluescribe(+)
vector (Stratagene, USA). A clone containing the MAC coding
region downstream of the T7 promoter was selected. The unique
Sma I cleavage site of the resulting expression vector pBSMAC
was converted into a Bgl II site by an 8-mer linker addition,
resulting in pBBMAC.

Clone P638, which contains the ORF1a/ORF1b overlap (Fig.
1), was digested with Pst I. The 1.3 kb cDNA insert was purified,
digested with Alu I and ligated into the filled-in and
dephosphorylated EcoR I site of a Bluescribe plasmid. After
transformation bacterial colonies containing the 160 bp Alu I
fragment spanning the ORF1a/ORF1b overlap were selected by
colony hybridizations (28) using a nick-translated BamH I-Kpn
I 850 bp cDNA fragment of clone P1136 as probe. Clones
containing the expected 160 bp insertion were sequenced. The
resulting plasmid pAPO was digested with EcoR I and made blunt
ended with Klenow fragment of DNA polymerase. After ligation
to 12-mer BamH I linkers and digestion with BamH I the DNA
fragment was purified and ligated into Bgl II cut,
dephosphorylated expression vector pBBMAC. Plasmid DNA
isolated from the resulting transformants was sequenced to
determine the orientation and the borders of the polymerase
ORF1a/ORF1b overlapping fragment of several clones. Clone
pAP1 was found to be correct and used for analysis of the
potential ribosomal frameshifting.

In vitro transcription and translation 

Plasmid DNA of pAP1 was purified on a CsCl gradient (29) and
linearized with Hind III. In vitro transcription and translation were
performed as described (30). 

Figure 1. Cloning and sequencing strategy of the ORF1b region of the MHV-A59
polymerase gene. A) The upper line represents the MHV genome. Vertical
bars and the numbers above indicate the junction sequences involved in the initiation
of the transcription of the corresponding mRNA. Open boxes represent the open
reading frames in the polymerase gene. B) The open bar represents the sequenced
region of the polymerase gene. The black triangle and the open triangle indicate
the start of the large 3' ORF (ORF1b) and the junction sequence for the initiation
of mRNA2 transcription, respectively. The positions of oligonucleotides A and
T12 are indicated. Negative numbers mark the distance to the start of the poly(A)
tail of the genome. The numbered bars refer to the cDNA clones used for
sequencing; the sequenced areas are indicated in black.

In vivo expression of pAP1

HeLa cells (2 × 10^6) were infected with recombinant vaccinia
virus vTF7-3, which contains the T7 RNA polymerase gene
under the control of a vaccinia promoter (31) at a m.o.i. of 10
p.f.u. At 90 min. post infection (p.i.) the cells were transfected
with pAP1 (5 ng) as described by Gorman (29). At 14 hr. p.i.
the cells were labelled for 30 min. with 60 µCi/ml
[35S]-methionine (32). Cell lysates were prepared (33) and
clarified for 90 min. at 12,000 × g (4°C).

Immunoprecipitations of proteins 

Immunoprecipitations of the [35S]-methionine labelled proteins
were performed as previously described (32); 4 µl of in vitro
translation mixtures or 150 µl of the cell lysates were used. A
monoclonal antibody (moab) J.1.3 (a gift of John Fleming and
Stephen Stohlman; ref. 34) directed against the amino-terminal
region of the M protein of MHV-A59 and an anti-peptide serum 
raised against the carboxyl-terminal 18 amino acids of the M 
protein (Rottier, manuscript in preparation) were used as MHV 
membrane protein specific antisera. 


RESULTS 

Molecular cloning of the polymerase gene 

Initially we used a synthetic oligonucleotide complementary to 
the conserved junction sequence immediately upstream of the 
nucleocapsid gene to prime the cDNA synthesis on purified
genomic RNA. Virus specific clones were mapped by dot blot
analysis with the purified individual MHV mRNAs (data not 
shown). Oligonucleotide A was synthesized from sequence data 
obtained from a cDNA clone which only hybridized to genomic 
RNA. Oligonucleotide T12 was synthesized on the basis of the 
nucleotide sequence of RNase T1 resistant oligonucleotide T12 
(based on the nomenclature of ref. 35; unpublished results), which 
is located in the pol gene of MHV-A59 (36). Oligonucleotides 
A and T12 were used to screen the random primed genomic 
cDNA library. Several large cDNA clones (P095, P096 and 
P098) hybridizing to both oligonucleotides were identified (Fig. 
1). In addition several cDNA clones were identified which were 
only positive with oligonucleotide A, e.g. P030 and P035. The
cDNA clones P096 and P030 contain the complete coding
sequence of mRNA 2 (37); hence they map to the 3' end of the
polymerase gene. The cloning of the pol gene was completed
by screening the cDNA library using a nick translated probe
derived from the most 5' mapped cDNA clone on the viral
genome. Clones P638, P737 and P1136 were isolated from a
second random primed cDNA library prepared against genomic
RNA from an independent stock of MHV-A59 (1).

Nucleotide sequence analysis 

The cDNA clones used to determine a substantial part of the 
nucleotide sequence of the pol gene of MHV-A59 are shown (Fig. 
1). Each nucleotide of the consensus sequence (Fig. 2) was 
determined on at least two independent cDNA clones. Analysis
of the sequence data revealed the 3' end of an ORF (ORF1a)
partially overlapping with a large ORF (ORF1b) which covers
the 3' half of the pol gene (Fig. 1). ORF1b has a length of 8199
nucleotides and potentially encodes a protein of 309 kd (2733
amino acids; Fig. 2). The first potential translation initiation
codon is located at position 643 and the ORF terminates at
position 8443, which is just upstream of the conserved junction
sequence AAAUCUAUAC. This region separates the gene
encoding the 30 kd nonstructural protein from the pol gene (37).
ORF1a terminates at position 318 and overlaps ORF1b for 75
nucleotides.

Figure 2. Nucleotide sequence of the ORF1b region of the MHV-A59 polymerase
gene and the predicted amino acid sequence of the major open reading frames.
The most 5' AUG codon of ORF1b and the junction sequence between the pol
gene and the mRNA 2 coding region are underlined.

Figure 3. Comparison of the primary and secondary structure of the MHV-A59
and IBV-M42 ORF1a/ORF1b overlap. A) Alignment of the nucleotide sequences.
Numbers refer to the position of the 5' G residue in the published sequence
(MHV-A59, this article; IBV, ref. 14). B) Predicted secondary and tertiary RNA structure
at the site of ribosomal frameshifting. The potential codons involved in
frameshifting are underlined, and the termination codons of ORF1a are boxed.
Solid lines illustrate the potential pseudoknot. Differences in the nucleotides
involved in the predicted RNA structure of MHV in comparison to IBV are
indicated in the predicted structure for IBV to illustrate the observed covariation.
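
The length and mass quoted for ORF1b can be cross-checked with simple arithmetic. The sketch below is only a consistency check: the 110 Da average residue mass is a common rule of thumb assumed here, not a value taken from this paper, and the function name is hypothetical.

```python
# Assumed rule-of-thumb average mass of one amino acid residue (Da).
AVERAGE_RESIDUE_MASS_DA = 110


def orf_summary(n_codons, reported_kd):
    """Return (coding length in nt, rough mass in kd, implied mean residue mass in Da)."""
    coding_nt = n_codons * 3
    rough_kd = n_codons * AVERAGE_RESIDUE_MASS_DA / 1000
    mean_residue_da = reported_kd * 1000 / n_codons
    return coding_nt, rough_kd, mean_residue_da


# Values quoted in the text for ORF1b: 2733 amino acids, 309 kd.
coding_nt, rough_kd, mean_residue_da = orf_summary(2733, 309)
```

Under these assumptions 2733 codons correspond to 8199 coding nucleotides, matching the stated ORF length, and the reported 309 kd implies a mean residue mass of about 113 Da, close to the rule-of-thumb value.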

Analysis of the ORF1a/ORF1b overlapping sequence

It has been shown that the nucleotide sequence of the IBV
ORF1a/ORF1b overlapping region is capable of inducing
ribosomal frameshifting in vitro and in vivo (15, 16). A stable
stem-loop structure in this overlapping region was predicted to
be involved in this translational frameshifting. Comparison of
the nucleotide sequence of the ORF1a/ORF1b overlap of MHV-A59
to the ORF1a/ORF1b overlapping region of IBV-M42
revealed a well conserved stretch of nucleotides (Fig. 3A). Fig.
3B shows that a nearly identical stem-loop structure can be
predicted for the MHV ORF1a/ORF1b overlap region. The
insertion (MHV-A59) or deletion (IBV-M42) resulting in a gap 
of three nucleotides in the alignment (Fig. 3A) is located in the 
bulge of the stem-loop structure (Fig. 3B). Even if larger regions 
of the sequence (up to 500 nucleotides) were analyzed for the
presence of secondary structure the depicted hairpin structure still
folds as a separate entity.

Figure 4. Analysis of ribosomal frameshifting in vitro and in vivo. A) Diagram showing the expression plasmid pAP1 and the predicted sizes of protein products
that would be expected from translation of the transcribed RNA. The black area represents the T7 promoter, regions encoded by the mutant M protein gene are
hatched. Stop indicates the positions of the translation termination codons. B) In vitro translation products were analyzed directly (lane 1) or after immunoprecipitation
using moab J.1.3 (lane 2) or the carboxyl-terminal specific anti-peptide serum (lane 4), respectively. Lanes 3 and 5: immunoprecipitations using the corresponding
pre-immune sera. C) Lysates from pAP1 transfected and vaccinia virus vTF7-3 infected cells were immunoprecipitated using moab J.1.3 (lane 1), the anti-peptide
serum (lane 3) and the pre-immune sera (lanes 2 and 4). M indicates molecular weight markers.

In both IBV and MHV the translation termination codon of 
ORF1a is located in this stem-loop structure. Apart from this
conservation in secondary structure the potential tertiary structure 
is also well conserved. The nucleotide sequence of the loop gives 
rise to potential pseudoknot formation with sequences downstream 
of the proposed stem-loop structure. The significance of these 
proposed secondary and tertiary RNA structures is emphasized 
by the presence of covariation. Mutations in one part of the stem 
or in the nucleotide sequence of the loop are compensated by 
mutations in either the stem or in the downstream sequence 
involved in the potential pseudoknot, respectively (Fig. 3B). 
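
The covariation argument can be made concrete with a small check: two versions of a stem differ in sequence, yet every position still forms a base pair. The sketch below is illustrative only. The stem arms are invented toy sequences, not the MHV or IBV structures of Fig. 3B, the function names are hypothetical, and allowing G-U wobble pairs alongside Watson-Crick pairs is an assumption made here.

```python
# Watson-Crick pairs plus G-U wobble pairs (the latter is an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def fully_paired(arm5, arm3):
    """True if the 5' arm pairs at every position with the 3' arm read antiparallel."""
    if len(arm5) != len(arm3):
        raise ValueError("stem arms must have equal length")
    return all((a, b) in PAIRS for a, b in zip(arm5, reversed(arm3)))


def is_compensated(arm5_a, arm3_a, arm5_b, arm3_b):
    """True when two stems differ in sequence but both remain fully paired."""
    differs = (arm5_a, arm3_a) != (arm5_b, arm3_b)
    return differs and fully_paired(arm5_a, arm3_a) and fully_paired(arm5_b, arm3_b)


# Hypothetical stems: the second carries a G->A change in the 5' arm
# matched by a C->U change in the 3' arm.
compensated = is_compensated("GGCA", "UGCC", "GACA", "UGUC")
# The same 5' change without the matching 3' change breaks a pair.
uncompensated = is_compensated("GGCA", "UGCC", "GACA", "UGCC")
```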

Ribosomal frameshifting in vitro and in vivo 
To prove that the ORFla/ORFlb overlapping region of the MHV 
polymerase gene directs ribosomal frameshifting we cloned this 
region in a mutant M protein gene of MHV-A59 under the control 
of a T7 promoter. Termination of translation of pAPl transcripts 
at the ORFla UAA stopcodon will result in the synthesis of a 


19 kd protein. However, when a —1 translational frameshift 
occurs a protein of 25 kd will be synthesized (Fig. 4A). Direct 
analysis of in vitro translation products revealed both proteins 
(Fig. 4B, lane 1). As expected moab J.1.3, directed against the 
N-terminus of the M protein, immunoprecipitated both the 19 
kd and 25 kd products, indicating that both proteins have a 
common N-terminus (Fig. 4B, lane 2). Only the 25 kd protein 
was specifically immunoprecipitated by the C-terminal anti¬ 
peptide serum (Fig 4B, lane 4). None of the translation products 
were immunoprecipitated by the pre-immune sera (Fig 4B, lane 
3 and 5). Since the methionine residues are only encoded by the 
region upstream of the ORFla/ORFlb overlap the frameshift 
efficiency can be easily estimated. The bands corresponding to 
the 19 kd and 25 kd products were excised from lane 1 (Fig. 
4B) of the dried gel and from the amount of radioactivity an 
efficiency of approximately 40% was calculated. 
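
The estimate above reduces to a ratio of counts, because both products carry the same number of labelled methionine residues. A minimal sketch in Python, assuming background-corrected counts for the two excised bands; the function name and the example counts are illustrative only and are not data from this study:

```python
def frameshift_efficiency(counts_19kd, counts_25kd):
    """Fraction of translation events that shifted frame.

    Assumes both products contain the same number of labelled
    methionine residues, so counts are proportional to molar amounts.
    """
    total = counts_19kd + counts_25kd
    if total <= 0:
        raise ValueError("total counts must be positive")
    return counts_25kd / total


# Hypothetical counts, chosen only to illustrate the arithmetic.
print(frameshift_efficiency(counts_19kd=600.0, counts_25kd=400.0))
```

If the two products differed in methionine content, each count would first have to be divided by the number of methionines in that product.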

To test whether the frameshift signal in the ORFla/ORFlb 
overlap was functional in vivo, HeLa cells were infected with the 
vaccinia virus recombinant vTF7-3, expressing the T7 RNA 
polymerase, and subsequently transfected with pAPl. Cells were 
labelled with [35S]-methionine and cell lysates were 
immunoprecipitated using moab J.1.3 and the anti-peptide serum. 
Moab J.1.3 specifically immunoprecipitated the expected 
polypeptides of 19 kd and 25 kd from pAPl transfected and 
vTF7-3 infected cells (Fig. 4C, lane 1). These polypeptides were 
not present in lysates from cells that had only been infected 
with vTF7-3 (data not shown). The 25 kd protein was also 
immunoprecipitated by the anti-peptide antiserum directed against 
the carboxyl-terminus of the M protein (Fig. 4C, lane 3). None 
of these proteins were precipitated by the pre-immune sera (Fig. 
4C, lanes 2 and 4). From these data it was concluded that the 
MHV-A59 ORFla/ORFlb overlapping region was capable of 
inducing ribosomal frameshifting both in vitro and in vivo. 

Figure 5. Proportional dot matrix comparison of the amino acid sequences from 
ORFlb of MHV-A59 and of IBV-M42. Numbers of the amino acid residues are 
indicated at the axes. For creating the dot-matrix plot the program 'Compare' 
of the Genetics Computer Group (Wisconsin) (25) was used with a window size 
of 21 and a stringency of 15. 

Computer analysis of the coronavirus polymerase genes 

Comparison of the predicted amino acid sequence of the products 
encoded by ORFlb of MHV-A59 and IBV-M42 revealed two 
large regions of high similarity (Fig. 5). The positional identity 
in an alignment of the amino acid sequence of ORFlb of MHV 
and IBV is 56%. 
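
The two measures used here, a windowed dot-matrix comparison and positional identity in an alignment, can be sketched in Python as follows. This is an illustration under stated assumptions, not the 'Compare' program itself: it scores exact identities only, whereas 'Compare' may score residue pairs differently; it excludes gapped columns from the identity calculation, which may not match how the 56% figure was computed; and the function names and toy sequences are hypothetical.

```python
def dot_matrix(seq_a, seq_b, window=21, stringency=15):
    """Return (i, j) window starts where at least `stringency`
    of `window` aligned residues are identical."""
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                a == b
                for a, b in zip(seq_a[i:i + window], seq_b[j:j + window])
            )
            if matches >= stringency:
                hits.append((i, j))
    return hits


def positional_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over alignment columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if a != gap and b != gap
    ]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    identical = sum(a == b for a, b in columns)
    return 100.0 * identical / len(columns)


# Toy sequences, not taken from MHV or IBV.
print(dot_matrix("ACDEFGHIK", "ACDEFGHIK", window=3, stringency=3))
print(positional_identity("ACD-F", "ACEGF"))
```

The nested loop is quadratic in sequence length, which is acceptable for a sketch but slow for sequences of several thousand residues.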

Several short sequence motifs have been identified in particular 
polymerase proteins of RNA and DNA viruses. One motif which 
contains the core sequence ‘GDD’ (38) has also been identified 
in the amino acid sequence of ORFlb of IBV (39). This domain 
is conserved at an almost identical position in the product encoded 
by ORFlb of MHV-A59 (Fig. 6). However, in contrast to the 
‘GDD’ amino acid sequence, both MHV and IBV contain the 
amino acid sequence ‘SDD’ at this position. Although 
occasionally an M, C, V or L residue has been reported in the 
position of the G residue (38), no serine residue has been reported 
immediately upstream of the two conserved aspartic acid residues. 
Analyzing the ORFlb amino acid sequence of MHV-A59 
revealed the presence of another ‘GDD’ motif at position 
2268-2270. Although it cannot be excluded that the ORFlb 
encodes more than one polymerase activity, it is unlikely that 
this ‘GDD’ motif is part of the active site of a coronavirus 
polymerase since the surrounding sequences do not meet the 
criteria proposed by Argos (38). Furthermore this ‘GDD’ 
sequence is not conserved between MHV and IBV. 

Figure 6. A) Localization of the 'polymerase' (B) and 'helicase' (C) domains in 
the second ORFs of the IBV and MHV-A59 pol genes. B and C) Amino acid 
sequence of the conserved domains identified in ORFlb of MHV-A59. Conserved 
residues are indicated by triangles. Numbers on the left refer to the localization 
of the conserved domains in the ORFlb sequence. Fig. 6B: The ‘GDD’ or 
‘polymerase’ motif (38, 39). The position of the serine residue which replaces 
the glycine is indicated with an open triangle. Fig. 6C: The ‘GKS/T’ or ‘helicase’ 
motif (40, 41). 
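
Locating candidate ‘GDD’ and ‘SDD’ tripeptides in a protein sequence can be sketched in Python as below. The sketch only reports where the tripeptide occurs; it does not evaluate the flanking-sequence criteria of Argos (38), which are what distinguish a likely active-site motif from a chance occurrence. The function name and the toy sequence are hypothetical.

```python
import re

# G or S followed by two aspartic acid residues.
MOTIF = re.compile(r"[GS]DD")


def find_polymerase_motifs(protein):
    """Return (1-based position, tripeptide) for every G/S-D-D match."""
    return [
        (match.start() + 1, match.group())
        for match in MOTIF.finditer(protein.upper())
    ]


# Toy sequence, not taken from ORFlb.
print(find_polymerase_motifs("MKSDDAAGDDK"))
```

Positions are reported 1-based to match the residue numbering convention used in the text.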

The amino acid residues encoded by ORFlb of IBV that 
exhibit similarity to a sequence motif found in a group of proteins 
from different organisms, proteins which are probably involved in 
crucial nucleoside triphosphate dependent steps in nucleic acid 
replication (40, 41), are also present at a nearly identical position 
in the amino acid sequence encoded by ORFlb of MHV-A59 
(Fig. 6). 

No other significant similarities were identified when 
overlapping regions of approximately 300 residues of the MHV-A59 
ORFlb amino acid sequence were tested using the program 
FASTA (24) and the NBRF/PIR and NBRF/NEW protein 
identification resources (releases 19.0 and 37.0, respectively). 

DISCUSSION 

In this paper the primary structure of the second ORF of the 
putative MHV-A59 pol gene is described. Assuming that the 
organization of the polymerase gene of MHV is identical to the 
equivalent gene of IBV, in which two large ORFs have been 
identified (14), the nucleotide sequence of the small 5' ORF 
presented in this article represents the 3' end of a large ORF. 
This ORF starts at position 210 at the 5' end of the viral genome 
(Fig. 1; ref. 1). 

No similarity has been detected in the predicted amino acid 
sequence of the 5' end of ORFla of the IBV and MHV-JHM 
or MHV-A59 polymerase gene (1, 12). However, the putative 
carboxyl terminal region of the translation product of ORFla and 
almost the complete translation product of ORFlb are well 
conserved between MHV-A59 and IBV. Recently, we have 
determined a small part (0.3 kb) of the nucleotide sequence of 
the 3' end of the polymerase gene of the feline coronavirus FIPV 
(De Groot, unpublished results). The deduced amino acid 
sequence of this region was very similar to the carboxyl terminal 
part of ORFlb of MHV and IBV. In contrast to the high similarity 
in the second ORF of the pol gene, no significant similarity has 
been observed in the amino acid sequence of the other viral 
nonstructural proteins. The structural proteins of coronaviruses 
only show significant similarity in relatively small regions. The 
overall identity in the nucleocapsid, membrane and spike proteins 
is 29%, 30% and 35%, respectively (reviewed in 4). This strongly 
suggests selective pressure against mutations in the pol gene to 
conserve functional domains. Identical observations have been 
made for the picornaviruses (42), alphaviruses (43), flaviviruses 
(44) and negative stranded RNA viruses like rhabdoviruses (45). 

The ‘S/GDD’ as well as the nucleoside triphosphate binding 
‘GKS/T’ amino acid sequence motifs, which are encoded in the 
pol genes of coronaviruses, are also well conserved in the 
polymerases of viruses belonging to the picornavirus-like and 
alphavirus-like superfamilies of (+) stranded RNA viruses (46, 
47). However, because of the quasi-helical nucleocapsid 
morphology and the expression of the viral genes by multiple 
subgenomic mRNAs, coronaviruses could not be assigned to 
either superfamily of RNA viruses (47). 

During the replication of coronaviruses multiple subgenomic 
RNAs are synthesized to position internal ORFs on the genome 
at the 5' end of an mRNA. No subgenomic mRNA containing 
ORFlb of the MHV polymerase gene at its 5'-proximal end has 
been detected in infected cells. Using an expression vector which 
contained the ORFla/ORFlb overlap of MHV inserted in frame 
within an MHV-A59 mutant M protein gene construct, we were 
able to demonstrate that the ORFla/ORFlb spanning sequence 
was capable of directing ribosomal frameshifting in vitro and in 
vivo. Brierley et al. (15, 16) have shown that the ORFla/ORFlb 
overlap region of IBV is also capable of inducing frameshifting 
in vitro and in vivo. Comparison of the nucleotide sequences of 
the overlapping region of IBV and MHV revealed that the signals 
used for ribosomal frameshifting in coronaviruses are well 
conserved. In both MHV and IBV a stable stem-loop structure 
can be formed downstream of the conserved sequence 
UUUAAAC. This sequence probably functions as the actual site 
for ribosomal frameshifting in MHV since mutations in this 
sequence have been shown to influence the ribosomal 
frameshifting in the IBV ORFla/ORFlb overlapping region (16). 
It has also been shown to function as a site for ribosomal 
frameshifting in Rous sarcoma virus (48). The observed 
covariation between IBV and MHV in the predicted stem-loop 
structure and the pseudoknot underlines the importance of these 
structures in translational frameshifting. Pseudoknots are also 
predicted to be involved in the ribosomal frameshifting for the 
expression of the polymerase gene of many retroviruses and the 
luteoviruses (16, Ten Dam and Pleij pers. communication). 
Recently, it has been shown for IBV that the proposed pseudoknot 
in the ORFla/ORFlb overlapping region is essential for 
ribosomal frameshifting (16). 
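
To illustrate what a -1 shift at the conserved heptanucleotide means for the reading frame, the Python sketch below locates UUUAAAC in an RNA string and lists the codons read before and after the shift. It rests on assumptions the text does not spell out: that the zero frame reads the heptanucleotide as U UUA AAC, and that after the slip the final C is read again as the first nucleotide of the next codon. The function names and the toy RNA are hypothetical, and the downstream stem-loop and pseudoknot are not modelled.

```python
SLIPPERY_SITE = "UUUAAAC"


def find_slippery_sites(rna, site=SLIPPERY_SITE):
    """Return the 0-based start index of every occurrence of `site`."""
    positions = []
    start = rna.find(site)
    while start != -1:
        positions.append(start)
        start = rna.find(site, start + 1)
    return positions


def codons_with_minus_one_shift(rna, site_start):
    """Split the message into codons read before and after a -1 shift.

    Assumption: zero-frame codons UUA and AAC start at site_start + 1
    and site_start + 4; after the shift, reading resumes at the last
    nucleotide of the heptanucleotide (site_start + 6).
    """
    first = site_start + 1
    phase = first % 3  # zero-frame phase implied by the assumption above
    upstream = [rna[i:i + 3] for i in range(phase, first + 6, 3)]
    resume = site_start + 6
    downstream = [rna[i:i + 3] for i in range(resume, len(rna) - 2, 3)]
    return upstream, downstream


# Toy RNA, not the MHV sequence: zero frame is AUG GCU UUA AAC GGG UAA.
toy_rna = "AUGGCUUUAAACGGGUAA"
for position in find_slippery_sites(toy_rna):
    print(position, codons_with_minus_one_shift(toy_rna, position))
```

In the toy message the zero frame ends at the UAA stop codon, while the shifted frame reads through it, which is the situation described for the ORFla stop codon.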

Frameshifting is more efficient in the coronaviral system 
(35-40% frameshifting) than in the retroviral system where only 
5-10% frameshifting has been observed (49). Ribosomal 
frameshifting is an elegant mechanism for regulating the synthesis 
of several proteins in a well balanced manner. In many 
retroviruses the polymerase is produced after translational 
frameshifting which results in the expression of a gag-pol fusion 
protein (48). Based on sequence comparison, it is postulated in 
this article that the polymerase function of MHV-A59 is encoded 
downstream of the ribosomal frameshifting sequence and 
expressed as a fusion protein. It is tempting to speculate that this 
fusion protein is the actual polymerase. Cleavage of this functional 
polyprotein could result in an inactive pol protein. Such a 
mechanism would explain the observed requirements for 
continuous de novo protein synthesis during the replication of 
MHV-A59 (3, 6). 

The determination of the nucleotide sequence and the predicted 
amino acid sequence of MHV-A59 ORFlb will provide a basis 
for obtaining monospecific antisera against protein(s) encoded 
by ORFlb. These sera will be important for further 
characterization of the proteins involved in the discontinuous 
transcription of coronaviruses. 